Summarization of Web Pages by Keyword Extraction and Sentence Vector

Authors

  • Md. Shafayat Rahman
  • Ashequl Qadir
  • Md. Mohsin Ali Khan
  • Abdullah Azfar
Abstract

This paper proposes a system that runs in parallel with a conventional search engine to give the user unified, summarized information. The system relieves the user of manually opening each web link returned by the search engine. To add this feature to the search process, we propose a procedure that identifies the significant words in the web pages returned for a search string in any popular search engine. These keywords are then used to extract the significant information from the web pages and, by determining sentence vectors, to eliminate extraneous information.
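
The abstract does not specify how keywords are ranked or how sentence vectors are built, so the following is only a minimal sketch under stated assumptions: keywords are taken to be the most frequent non-stopword terms across the retrieved pages, and a sentence vector is taken to be the per-keyword occurrence count in a sentence. The names `extract_keywords`, `sentence_vector`, `summarize`, and the small `STOPWORDS` set are hypothetical and do not come from the paper.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not the paper's.
STOPWORDS = {"a", "an", "and", "the", "to", "into", "of", "in", "is", "here"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(pages, top_n=5):
    """Assumed ranking: most frequent non-stopword terms over all pages."""
    counts = Counter()
    for page in pages:
        counts.update(w for w in tokenize(page) if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def sentence_vector(sentence, keywords):
    """Assumed representation: one count per keyword for this sentence."""
    tokens = tokenize(sentence)
    return [tokens.count(k) for k in keywords]


def summarize(pages, top_n=5, max_sentences=3):
    """Keep sentences that mention keywords; drop those that mention none."""
    keywords = extract_keywords(pages, top_n)
    scored = []
    for page in pages:
        for sentence in re.split(r"(?<=[.!?])\s+", page):
            sentence = sentence.strip()
            if not sentence:
                continue
            vector = sentence_vector(sentence, keywords)
            # Score = number of distinct keywords present in the sentence.
            score = sum(1 for count in vector if count > 0)
            if score > 0:
                scored.append((score, sentence))
    # Stable sort: ties keep their original page order.
    scored.sort(key=lambda item: -item[0])
    return keywords, [sentence for _, sentence in scored[:max_sentences]]


if __name__ == "__main__":
    pages = [
        "Solar panels convert sunlight into electricity. Click here to subscribe.",
        "Solar panels need sunlight. Cookies policy applies.",
    ]
    keywords, summary = summarize(pages, top_n=3)
    print(keywords)
    print(summary)
```

In this sketch, sentences whose vector is all zeros (navigation text, subscription prompts) are treated as extraneous and discarded, which mirrors the filtering role the abstract assigns to sentence vectors.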

Similar resources

A Comparison of Word- and Term-based Methods for Automatic Web Site Summarization

Automatic Web site summarization is an effective means of making the content of a Web site easily accessible to Web users. We demonstrate that a content-based approach to summarization, which is based on keyword and key sentence extraction from narrative text, is able to generate summaries that are as informative as human-authored summaries. This work is directed towards summary generation base...

A Comparison of Keyword- and Keyterm-based Methods for Automatic Web Site Summarization

Automatic Web site summarization, which is based on keyword and key sentence extraction from narrative text, is an effective means of making the content of a Web site easily accessible to Web users. This work is directed towards summary generation based on multi-word terms extracted by the C-value/NC-value method. Keyterm-based summaries are compared with keyword-based summaries for a list of t...

Comparing Key Phrase Extraction Methods in Automatic Web Site Summarization

We benchmark five methods, TFIDF, KEA, Keyword, Keyterm, and Mixture, for key phrase extraction in the automatic Web site summarization task. We investigate the performance of these methods via a formal user study and demonstrate that Keyterm is the best method for extracting key phrases while Mixture is the best one for obtaining key sentences.

Automatic Generation of Term Descriptions by Web-based Multi-Document Summarization

We developed a Web search system called "Cyclone", which extracts high quality term descriptions and helps users to obtain encyclopedic knowledge efficiently. In the current implementation, multiple paragraph-style descriptions extracted from different Web pages are presented in response to a user keyword. However, to obtain sufficient information for a single keyword, a user usually has to...

ASHRAM: Active Summarization and Markup

Typically, searching for information in a document collection amounts to refining a query and then scanning a large number of documents to determine their relevance. Active Summarization Having Related Active Markup (ASHRAM) is a facility for representing and automatically selecting, marking, and linking useful and/or salient items in a document, to make it easier for the user to determine the ...

Publication date: 2006